78 PART 2 Examining Tools and Processes

Making Forgivable (and Non-Forgivable) Errors

A central concept in statistics is that of error. In statistics, the term error sometimes means what you think it means: that a mistake has been made. In those cases, the statistician should take steps to avoid the error. But other times in statistics, the term error refers to a phenomenon that is unavoidable, and as statisticians, we just have to cope with it.

For example, imagine that you had a list of all the patients of a particular clinic and their current ages. Suppose that you calculated the average age of the patients on your list, and your answer was 43.7 years. Because the list covers every patient, that number would be a population parameter. Now, let’s say you took a random sample of 20 patients from that list and calculated the mean age of the sample, which would be a sample statistic. Do you think you would get exactly 43.7 years? Although it is certainly possible, in all likelihood, the mean of your sample (the statistic) would be a different number than the mean of your population (the parameter). This difference between a sample statistic and the population parameter, which arises simply because you measured a sample rather than the whole population, is called sampling error. Sampling error is unavoidable, and as statisticians, we are forced to accept it.
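
To make the clinic example concrete, here is a minimal Python sketch. The 500 ages are randomly generated stand-ins, not real patient data, and the variable names are our own. It computes the population parameter, draws a simple random sample of 20, and reports the gap between the two means.

```python
import random
import statistics

# Hypothetical clinic list: 500 made-up patient ages (an assumption,
# not data from the text). The fixed seed makes the run repeatable.
rng = random.Random(42)
population_ages = [rng.randint(18, 90) for _ in range(500)]

# Population parameter: the mean age of every patient on the list.
population_mean = statistics.mean(population_ages)

# Sample statistic: the mean age of a simple random sample of 20 patients.
sample_ages = rng.sample(population_ages, 20)
sample_mean = statistics.mean(sample_ages)

# Sampling error: the gap between the statistic and the parameter.
sampling_error = sample_mean - population_mean

print(f"population mean: {population_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}")
print(f"sampling error:  {sampling_error:+.1f}")
```

If you change the seed and rerun the sketch, the sample mean will usually land somewhere different each time, while the population mean changes only because the made-up list itself changes. Nobody made a mistake here; the gap comes from sampling alone.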

Now, to describe the other type of error, let’s add some drama. Suppose that when you went to take that sample of 20 patients, you spilled coffee on the list so you could not read some of the names. The names blotted out by the coffee were therefore ineligible to be selected for your sample. This is unfair to the names under the coffee stain: they have a zero probability of being selected for your sample, even though they are part of the population from which you are sampling. This is called undercoverage, and it is considered a type of non-sampling error. Non-sampling error is essentially a mistake: something goes wrong during sampling that you should try to avoid. Unlike sampling error, undercoverage is a mistake you can and should avoid (for example, by not spilling coffee on your list).
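
The coffee-stain scenario can be sketched the same way. In this illustrative Python snippet, the 100 patient IDs and the choice of which 15 are blotted out are assumptions made up for the example. It repeats the 20-patient draw many times and counts how often a blotted patient is selected.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

# Hypothetical list of 100 patient IDs. IDs 0 through 14 are assumed
# to be hidden under the coffee stain.
patients = list(range(100))
blotted = set(range(15))

# Only legible names can be drawn, so this is what you actually sample from.
legible = [p for p in patients if p not in blotted]

# Repeat the 20-patient draw many times and count each patient's selections.
draws = 2000
times_selected = {p: 0 for p in patients}
for _ in range(draws):
    for p in rng.sample(legible, 20):
        times_selected[p] += 1

blotted_selections = sum(times_selected[p] for p in blotted)
print("selections of blotted patients:", blotted_selections)
```

However many times the draw is repeated, the count for the blotted patients stays at zero, because they were never available to be drawn. That is what undercoverage looks like in practice.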

Framing Your Sample

In the previous example, the patient list is considered your sampling frame. A sampling frame is the practical representation of the population from which you are literally drawing your sample. We described this list as a printout of patient names and their ages. Suppose that after the list was printed, a few more patients joined the clinic, and a few patients stopped using the clinic because they moved away. This situation means that your sampling frame (your list) is not a perfect representation of the actual population from which you are drawing your sample.
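
One rough way to picture the mismatch between a sampling frame and the population is to compare them as sets. The patient IDs below are invented for illustration, and the variable names are our own.

```python
# Hypothetical patient IDs, invented for illustration only.
sampling_frame = {"P001", "P002", "P003", "P004", "P005"}      # the printed list
current_population = {"P001", "P002", "P004", "P006", "P007"}  # patients today

# On the list but no longer patients (they moved away).
left_clinic = sampling_frame - current_population

# Patients today but missing from the list (they joined after it was printed).
joined_after_printing = current_population - sampling_frame

print("on list, no longer patients:", sorted(left_clinic))
print("patients missing from list: ", sorted(joined_after_printing))
```

In this sketch, the patients who joined after printing cannot be selected from the list, just like the names under the coffee stain, while the patients who moved away can still be drawn even though they are no longer part of the population.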